Abstract

Tests of statistical significance are widely used in educational 
and psychological research to facilitate interpretation of 
findings. But such tests do not reflect the degree of stability
of findings across samples, and in multiple regression, where the
effectiveness of the resulting prediction equation is subject to
"shrinkage", it is especially important to evaluate result
replicability. Indeed, since all parametric analytic methods are
special cases of regression (just as all univariate and 
multivariate methods are special cases of canonical correlation 
analysis), evaluating result replicability is important in all
sorts of studies. Double cross-validation is an empirical method 
by which an estimate of invariance or stability can be obtained 
from research data in hand. This paper discusses the procedure 
for double cross-validation using both a heuristic data set and 
an actual research data set to illustrate both a nongeneralizable 
outcome and a generalizable outcome. 
Double Cross-validation in Multiple Regression:
A Method for Estimating the Stability of Results 

A growing trend in educational and psychological research 
has been the recognition that sole reliance upon statistical 
significance testing presents insufficient evidence to determine 
the importance of research findings (Carver, 1978; Craig, Eison,
& Metze, 1976; Thompson, 1989). The most compelling argument 
against statistical significance testing addresses the issue of 
sample size effects on the outcome of the null hypothesis test. 

More specifically, the larger the sample size, the greater 
is the likelihood that a null hypothesis will be rejected 
(Carver, 1978). As a result, if one's sample size is sufficient, 
then even the most trivial research findings will become 
statistically significant and thereby seem "important." On the
other hand, it is likely that results based on small sample sizes
may not be statistically significant, but may nevertheless reveal 
noteworthy results. Since the null hypothesis in social sciences 
research is seldom, if ever, exactly true, a sufficiently large 
sample size will almost always yield a statistically significant 
result (Fish, 1986; Sandler, 1987). 
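
The dependence of statistical significance on sample size can be illustrated with the standard t test of a correlation coefficient. The sketch below is not part of the original analysis; the function name `t_for_r` and the chosen values (r = .10, n = 20 and n = 1000) are hypothetical and serve only to show the same trivial effect falling on either side of the conventional two-tailed critical value of about 1.96.

```python
import math


def t_for_r(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, given a sample correlation r and n cases."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# The same small correlation (about 1% shared variance) at two sample sizes.
t_small = t_for_r(0.10, 20)     # illustrative small sample
t_large = t_for_r(0.10, 1000)   # illustrative large sample

print(f"n = 20:   t = {t_small:.2f}")
print(f"n = 1000: t = {t_large:.2f}")
```

The effect size is identical in both calls; only n changes, which is the point made in the text.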

An even more compelling problem with statistical 
significance testing is that it offers the researcher no 
indication of the replicability of the results, that is, the 
likelihood that such results would be reproduced in the future 
(Thompson, 1989). In essence, relying only on statistical
significance in determining the merits of research findings
"represents a corrupt form of the scientific method" (Carver,
1978, p. 378).

Given the limitations of statistical significance testing in 
determining the stability of research results, certainly the best 
predictor of result generalizability, or stability across 
samples, would be to conduct replications on as many samples as 
possible, thereby empirically validating result stability. But 
given that in educational and psychological research such a 
solution is often impractical, the stability of research results 
across samples can be estimated by using invariance techniques 
(Fish, 1986). As described by Englehard (1989), "invariance can 
also be viewed more broadly as the quest for generality in 
science" (p. 32). Borrowing from Englehard's discussion on the
history of invariance, the concept of invariance within social 
sciences research is best described by Stevens (1951): 

The scientist is usually looking for invariance 
whether he knows it or not . . . The quest for 
invariant relations is essentially the aspiration 
toward generality, and in psychology, as in physics, 
the principles that have wide applications are those 
we prize. (p. 20)

Because methods of invariance investigation depend to a 
large extent upon the analytic method used, the number of 
invariance procedures is quite large. The present paper
focuses upon invariance procedures used with multiple
regression analysis and, more specifically, double cross-
validation. However, given that all statistical analytic methods
are interrelated, the logic illustrated in the present paper can 
certainly be generalized to other analytic methods. 



Result Stability in Multiple Regression 
The Problem of Shrinkage 

In multiple regression analysis, the researcher seeks to 
find those independent variables (and their respective weights) 
that correlate most highly with the dependent variable. The
analysis produces weights that can be applied to the predictor 
variables to yield a predicted score, YHAT, for each subject. 
When the variables are all in z-score form these weights are
called beta weights, and when the variables are unstandardized 
the weights are called b weights. 

The weights are developed subject to the restriction that 
the YHAT scores must come as close as possible to the Y scores in 
the sample, for the sample as a whole. The deviation of a given 
subject's YHAT from the subject's Y is the subject's e score.
Thus, the weights are derived to minimize the e scores, or, more 
specifically to minimize the sum of the squared e scores, also 
called SOS error or SOS within. 

The multiple correlation, R, is the correlation between the 
predicted scores, i.e., the YHAT scores, and the observed 
criterion scores, i.e., the Y scores. R² represents the
proportion of variance of the dependent variable that is shared 
with the independent variables as a set, the set being 
represented by the YHAT scores (Pedhazur, 1982). 
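
As a hedged illustration of how beta weights and R² relate, the sketch below solves the standardized normal equations (R_xx · beta = r_xy) using the subsample 1 correlations reported in Table 2. This is one standard way to obtain least-squares beta weights from a correlation matrix, not necessarily how SAS computes them internally; the helper `solve` and the variable names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Correlations among X1..X5 for subsample 1 (Table 2).
r_xx = [
    [1.00000, 0.70185, 0.70028, 0.66596, 0.54554],
    [0.70185, 1.00000, 0.28096, 0.03819, 0.41411],
    [0.70028, 0.28096, 1.00000, 0.59227, 0.33971],
    [0.66596, 0.03819, 0.59227, 1.00000, 0.07339],
    [0.54554, 0.41411, 0.33971, 0.07339, 1.00000],
]
# Correlations of each predictor with Y for subsample 1 (Table 2).
r_xy = [0.89693, 0.61959, 0.75530, 0.51278, 0.61643]

betas = solve(r_xx, r_xy)

# With standardized variables, R squared is the sum of beta * r(Y, X) products.
r_squared = sum(b * r for b, r in zip(betas, r_xy))

print([round(b, 3) for b in betas], round(r_squared, 3))
```

Because the inputs are correlations rounded to five decimals, the recovered weights should agree with the standardized estimates in Table 2 only to about three decimal places.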

It should be noted, however, that R² is the maximum
mathematical value for the given sample due to "overfitting," or
the capitalization on sampling error in the derivation of the
"optimal" weights for the sample data (Mitchell & Klimoski,
1986). Mosier (1951) describes this chance factor as involving
the idiosyncratic characteristics of the sample. Pedhazur (1982) 
explains that such overfitting is due to the treatment of zero- 
order correlations as being error-free, which is never true. 
Pedhazur further contends that the degree of overestimating R is 
affected by the ratio of the number of independent variables to 
sample size and that as the number of independent variables 
approaches the sample size, the likelihood of overestimating R 
(and R²) gets larger.
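
One common way to express this dependence on the ratio of predictors to sample size is the adjusted R² that regression programs report (the ADJ R-SQ entries in Tables 2 and 3). The formula is not stated in the text; the sketch below assumes the usual adjustment, 1 − (1 − R²)(n − 1)/(n − k − 1), and checks it against the sums of squares printed in those tables. The function name is hypothetical.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R squared for n cases and k predictors (assumed standard formula)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# R squared = SS model / SS total, taken from the ANOVA summaries in Tables 2 and 3.
r2_sub1 = 13.22347310 / 15.11482730   # subsample 1: n = 11, k = 5
r2_sub2 = 15.48388211 / 17.19528774   # subsample 2: n = 9,  k = 5

adj_sub1 = adjusted_r_squared(r2_sub1, n=11, k=5)
adj_sub2 = adjusted_r_squared(r2_sub2, n=9, k=5)

print(round(adj_sub1, 4), round(adj_sub2, 4))
```

With five predictors and only 9 to 11 cases, the adjustment removes a large share of the apparent R², which is consistent with the overfitting concern raised above.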

These difficulties pose problems in applying the regression 
equation to other samples. If the derived sample weights were 
applied to the predictor scores of another sample, the resulting 
multiple correlation between the predicted scores and the 
criterion scores of the second sample would almost always be less 
than the original multiple correlation (Pedhazur, 1982). In 
other words, shrinkage of R is almost certain to occur when the
regression equation is applied across samples, and in fact, 
predictor variables that prove to be statistically significant in 
the derivation sample may "shrink" to nonsignificant values in 
the second sample. In this context, if invariance represents 
stability across samples, then it can be viewed as inversely 
related to the degree of shrinkage. The smaller the degree of 
shrinkage, the greater is the invariance, and thus the more 
generalizable is the regression equation. 

The Double Cross-validation Procedure

"Double cross-validation" is an empirical invariance 
procedure used in multiple regression that essentially involves
the use of two samples or subsamples to produce two pairs of
regression equations from which respective shrinkages can be
determined. Double cross-validation offers a greater level of
confidence in generalizability when applied to two separate
samples than when applied to two subsamples created by splitting
a single sample; however, if the sample size is sufficiently
large, then two randomly assigned subsamples can provide a fair
estimate of result reproducibility. And using some estimate of
result stability or invariance is almost always better than 
failing to conduct any empirical evaluation of result 
replicability. 

Because educational and psychological researchers often 
encounter problems in obtaining data from more than one sample, 
the use of two random subsamples derived by splitting a sample 
may be a useful procedure. For this reason, the procedure for 
double cross-validation described in the present paper is offered 
in the context of comparing two subsamples rather than two 
separate samples. 

If a single sample is used, the first step in double cross-
validation requires that the sample be divided randomly into two
subsamples (e.g., 50% + 50%; 51% + 49%; 75% + 25%). Fish (1986) 
contends that the subsamples should be unequal, for if the 
results of a disproportionately smaller subsample (e.g., 25%) 
prove to be replicable, one might be willing to vest even more 
confidence in the generalizability of the results. However, such
"acid" tests may be counterproductive or overly conservative. Some
researchers will want to use subsamples of more nearly equal 
size, to provide greater likelihood that invariance will be 
found.

After creating the two subsamples, the variables within the
two sets are converted to z-scores. The z-scores are
standardized with the means and SDs of subsample 1 for the
subsample 1 data, and with the means and SDs of subsample 2 for
the subsample 2 data. Separate regression analyses are also
performed with each subsample's data. As a result, separate beta
weights are derived for each subsample's set of z-scores. The
resulting betas are then used to compute predicted Y values, YHAT,
for the cases in each subsample, such that:

$$\mathrm{YHAT}_{11} = \beta_{11} z_{11} + \beta_{12} z_{12} + \beta_{13} z_{13} + \cdots + \beta_{1j} z_{1j}$$

$$\mathrm{YHAT}_{22} = \beta_{21} z_{21} + \beta_{22} z_{22} + \beta_{23} z_{23} + \cdots + \beta_{2j} z_{2j}$$

In this notation, the first subscript for the YHATs indicates
which sample's z-scores were used to calculate the YHATs, while
the second subscript for YHAT indicates which sample's beta
weights were employed. The first subscript for the beta weights
and for the z-scores indicates which sample yielded the weights or
the z-scores, while the second subscript indicates the sequence
number of the predictor, ranging from 1 through the jth predictor
variable.

With two composite scores, YHAT_11 and YHAT_22, thus
determined, the next procedure is to "cross" beta weights and
compute two new sets of predicted YHAT values, namely YHAT_12 and
YHAT_21, by the same methods. In computing YHAT_12, the betas from
the regression of subsample 2 are applied to the z-scores of
subsample 1. Conversely, the betas of subsample 1 are applied to
the z-scores of subsample 2 in order to calculate YHAT_21. Thus,
these estimates take the form:

$$\mathrm{YHAT}_{12} = \beta_{21} z_{11} + \beta_{22} z_{12} + \beta_{23} z_{13} + \cdots + \beta_{2j} z_{1j}$$

$$\mathrm{YHAT}_{21} = \beta_{11} z_{21} + \beta_{12} z_{22} + \beta_{13} z_{23} + \cdots + \beta_{1j} z_{2j}$$

Upon completing the computation of the four sets of YHAT 
scores, two for subjects in each of the two subsample groups, 
various combinations of the scores can be correlated. The 
correlation of YHAT_11 with the Y scores of subjects in subsample
1 will yield the R for that subsample. The correlation of YHAT_22
with the Y scores of subjects in subsample 2 will yield the R for
that subsample.

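The invariance of the results can be evaluated in either of
two ways. First, the shrinkage can be evaluated for each group
as:

$$\mathrm{INV}_{1} = R_{11}^{2} - R_{12}^{2}$$

$$\mathrm{INV}_{2} = R_{22}^{2} - R_{21}^{2}$$

where R_12 is the correlation of YHAT_12 with the Y scores of
subsample 1, and R_21 is the correlation of YHAT_21 with the Y
scores of subsample 2.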

The more closely these two shrinkage estimates approach zero, the 
greater is the degree of stability across the subsamples, and 
hence the more confidence the researcher can vest in the 
replicability of the results. It should be noted that R²_11 and
R²_22 will almost certainly be greater than R²_12 and R²_21,
respectively, since the first two R²'s are the mathematical
optimums for their respective subsamples.
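
As a worked illustration of the shrinkage estimates, the sketch below applies the two formulas to the correlations of Y with each YHAT reported for Example 1 in Table 4. The function name `shrinkage` is hypothetical; the four correlations are taken from that table.

```python
def shrinkage(r_own: float, r_crossed: float) -> float:
    """Shrinkage in R squared when the other subsample's weights are applied."""
    return r_own ** 2 - r_crossed ** 2


# Subsample 1: r(Y, YHAT_11) versus r(Y, YHAT_12), from Table 4.
inv_1 = shrinkage(0.93137, 0.87360)
# Subsample 2: r(Y, YHAT_22) versus r(Y, YHAT_21), from Table 4.
inv_2 = shrinkage(0.93282, 0.63737)

print(round(inv_1, 3), round(inv_2, 3))
```

The second estimate is several times larger than the first, which anticipates the asymmetric result discussed for Example 1.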

One drawback of this method of evaluating shrinkage, however,
is that the result has no set metric. For example, shrinkage
from an R² of 70% to one of 60% is not the same as shrinkage from
an R² of 10% to one of 0%, since the former result is still quite
noteworthy, while the latter is not. These difficulties can be
overcome by comparing the r² of Y with YHAT_11 (i.e., the actual
R² of subsample 1) against the r² of Y with YHAT_12, and by
comparing the r² of Y with YHAT_22 (i.e., the actual R² of
subsample 2) against the r² of Y with YHAT_21. Second, the two
sets of predicted scores within each subsample can be correlated
directly: YHAT_11 with YHAT_12, and YHAT_22 with YHAT_21. These two
correlation coefficients can be called invariance coefficients.
The more closely these invariance r's approach one, the greater
is the degree of confidence obtained in the stability of the
regression equation across different configurations of subjects.
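
The full procedure described in this section can be sketched compactly. The code below is a minimal illustration under stated assumptions, not the program used in this paper: the data are synthetic (three hypothetical predictors, two subsamples of 100 cases), all function names are invented for the example, and beta weights are obtained by solving the standardized normal equations.

```python
import random
import statistics


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def zscores(column):
    """Standardize a column with its own subsample mean and SD."""
    mean, sd = statistics.mean(column), statistics.stdev(column)
    return [(v - mean) / sd for v in column]


def fit_betas(x_cols, y):
    """Beta weights from the normal equations R_xx * beta = r_xy."""
    r_xx = [[pearson(a, b) for b in x_cols] for a in x_cols]
    r_xy = [pearson(a, y) for a in x_cols]
    return solve(r_xx, r_xy)


def predict(betas, z_cols):
    """YHAT for each case: the sum of beta * z over predictors."""
    return [sum(b * col[i] for b, col in zip(betas, z_cols))
            for i in range(len(z_cols[0]))]


def double_cross_validate(y1, x1_cols, y2, x2_cols):
    z1 = [zscores(c) for c in x1_cols]
    z2 = [zscores(c) for c in x2_cols]
    b1, b2 = fit_betas(z1, y1), fit_betas(z2, y2)
    yhat_11, yhat_12 = predict(b1, z1), predict(b2, z1)  # subsample 1 z-scores
    yhat_22, yhat_21 = predict(b2, z2), predict(b1, z2)  # subsample 2 z-scores
    return {
        "R_11": pearson(y1, yhat_11), "R_12": pearson(y1, yhat_12),
        "R_22": pearson(y2, yhat_22), "R_21": pearson(y2, yhat_21),
        "inv_1": pearson(yhat_11, yhat_12),
        "inv_2": pearson(yhat_22, yhat_21),
    }


def make_sample(rng, n):
    """Synthetic data: Y depends on the first two of three predictors."""
    xs = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]
    y = [0.6 * xs[0][i] + 0.3 * xs[1][i] + rng.gauss(0, 1) for i in range(n)]
    return y, xs


rng = random.Random(0)
y1, x1 = make_sample(rng, 100)
y2, x2 = make_sample(rng, 100)
results = double_cross_validate(y1, x1, y2, x2)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Whatever the data, R_11 cannot be smaller than R_12 and R_22 cannot be smaller than R_21, because each subsample's own weights are the least-squares optimum for that subsample; the invariance coefficients carry no such guarantee and must be inspected empirically.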

Numerical Examples of Double Cross-Validation

In order to illustrate the double cross-validation 
procedure, two numerical examples will be presented. The first 
example uses a smaller heuristic data set (n = 20) of 5
predictors to illustrate an invariance situation in which the 
weights appear to be different across subsamples, but in fact 
yield reasonably equivalent results across at least one 
subsample. This example is utilized to drive home the point that 
weights must be compared empirically, rather than subjectively 
(Thompson, 1989).

In the second example, the data are drawn from an actual 
study of life satisfaction in elderly nursing home residents. In
this study, 200 nursing home residents were administered a life
satisfaction inventory which consisted of 8 subscales. These 8 
subscales were used as independent variables to predict overall 
nursing home satisfaction as measured on a self-report Likert 
scale. 




Example 1. The first step in the cross-validation procedure
requires that the data be randomly sorted into two relatively
equal subsamples. In this example, Subsample 1 contains 11
subjects, and Subsample 2 has 9 subjects. The hypothetical data 
are presented in Table 1. The SAS program used to analyze the 
data is presented in Appendix A. 



Insert Table 1 about here 



In the second step, the scores within each subsample were 
converted to z-scores by first computing means and variances of 
the independent variables, X1-X5, in an initial computer run. 
These results were then used in the second computer run to obtain 
z-scores, Z1_X1-Z1_X5 and Z2_X1-Z2_X5, for both samples. In
addition, separate regression analyses were conducted within each
subsample in order to generate each subsample's respective beta
weights. These results are presented in Tables 2 and 3. 



Insert Tables 2 and 3 about here 



At this point in the analysis z-scores and betas have been
computed for subsample 1 and subsample 2. From these values,
estimates of Y can be predicted for each of the subsamples:

Let YHAT_11 = predicted Y scores for Subsample 1
    YHAT_22 = predicted Y scores for Subsample 2

therefore:

YHAT_11 = (0.99311 x Z1_X1) + (-0.17191 x Z1_X2)
        + (0.25976 x Z1_X3) + (-0.30172 x Z1_X4)
        + (0.07974 x Z1_X5)
YHAT_22 = (-0.47831 x Z2_X1) + (0.39201 x Z2_X2)
        + (0.74884 x Z2_X3) + (0.68759 x Z2_X4)
        + (0.60224 x Z2_X5)

The third step is to "cross" the betas in subsample 2 with
the z-scores in subsample 1 in order to compute the invariance
coefficient for the subsample 2 weights. Likewise, the betas in
subsample 1 are crossed with the z-scores in subsample 2 in order
to calculate the invariance for the subsample 1 weights, as
follows:

Let YHAT_12 = invariance composite scores for Subsample 1
    YHAT_21 = invariance composite scores for Subsample 2

therefore:

YHAT_12 = (-0.47831 x Z1_X1) + (0.39201 x Z1_X2)
        + (0.74884 x Z1_X3) + (0.68759 x Z1_X4)
        + (0.60224 x Z1_X5)
YHAT_21 = (0.99311 x Z2_X1) + (-0.17191 x Z2_X2)
        + (0.25976 x Z2_X3) + (-0.30172 x Z2_X4)
        + (0.07974 x Z2_X5)

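
To make the arithmetic concrete, the following sketch recomputes YHAT_11 and YHAT_12 for the first case in Table 1 (ID 1, subsample 1). The raw scores come from Table 1, the subsample 1 means and SDs from the z-score statements in Appendix A, and the beta weights from Tables 2 and 3; the Python variable names are hypothetical.

```python
# Subsample 1 means and SDs for X1..X5, as used in Appendix A.
means_1 = [0.0750, -0.3193, -0.0080, -0.0166, 0.0753]
sds_1 = [1.6888, 1.1845, 0.7105, 1.1444, 0.9089]

# Beta weights from subsample 1 (Table 2) and subsample 2 (Table 3).
betas_1 = [0.99311, -0.17191, 0.25976, -0.30172, 0.07974]
betas_2 = [-0.47831, 0.39201, 0.74884, 0.68759, 0.60224]

# Raw predictor scores X1..X5 for case ID 1 (Table 1).
x_case = [2.6535, 1.6129, 1.4791, 0.8249, 1.3615]

# Step 1: z-scores using the case's own subsample statistics.
z_case = [(x - m) / s for x, m, s in zip(x_case, means_1, sds_1)]

# Step 2: own-sample composite (YHAT_11) and crossed composite (YHAT_12).
yhat_11 = sum(b * z for b, z in zip(betas_1, z_case))
yhat_12 = sum(b * z for b, z in zip(betas_2, z_case))

print(round(yhat_11, 4), round(yhat_12, 4))
```

Both values agree with the Table 1 entries for this case (1.6706 and 2.8344) to about three decimal places; small differences reflect rounding of the means, SDs, and weights.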
As a result of these computations, each subsample has two 
sets of predicted YHAT scores, namely YHAT_11 and YHAT_12 for
subsample 1 and YHAT_21 and YHAT_22 for subsample 2. These values
are then correlated to yield the invariance coefficients. The 
results for the example are presented in Table 4. 



Insert Table 4 about here 

The resulting correlation of YHAT_11 with YHAT_12 revealed
an invariance estimate of .95, indicating that the weights from
the two subsamples yield very similar estimates of YHAT. At
first glance this result may seem surprising, since the beta
weights ("STANDARDIZED ESTIMATES") presented in Tables 2 and 3
appear to be very different, e.g., +.079 for ZX5 in subsample 1
versus +.602 for ZX5 in subsample 2.

However, the invariance correlation of YHAT_22 with YHAT_21
was .54, indicating that the sets of weights were not equally
effective when they were both applied to the data for subsample 2.
This finding illustrates the utility of "doubly" cross-validating,
both ways. The discrepancy between these two invariance
estimates would contraindicate stability of predictor weights
across samples.
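
The two invariance coefficients can be recomputed directly from the YHAT columns of Table 1. The sketch below is a plain Pearson correlation written for illustration (the function and list names are hypothetical); the values are the YHAT scores listed in Table 1 for cases 1 to 11 and 12 to 20.

```python
def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Subsample 1 (IDs 1-11): own weights versus crossed weights.
yhat_11 = [1.6706, 0.0358, 0.2184, -0.5731, 0.6294, 0.2953,
           -0.9373, -0.8318, -0.8131, 0.4423, -0.1362]
yhat_12 = [2.8344, -0.0791, 0.2745, -2.0287, 1.0881, 0.6620,
           -1.1105, -1.5652, -1.1258, 1.5776, -0.5275]

# Subsample 2 (IDs 12-20): crossed weights versus own weights.
yhat_21 = [-0.2533, 0.1969, 0.6474, -1.7938, 1.5619,
           0.0817, 0.0437, 1.3755, -1.8595]
yhat_22 = [-1.1022, 0.0285, 0.5248, -0.4300, 1.6242,
           -0.9846, 0.3601, 0.1125, -0.1332]

inv_1 = pearson(yhat_11, yhat_12)
inv_2 = pearson(yhat_22, yhat_21)

print(round(inv_1, 2), round(inv_2, 2))
```

These reproduce the .95 and .54 reported above (0.94721 and 0.53884 in Table 4) to within rounding of the tabled scores.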

Example 2. Regarding the nursing home satisfaction example, 
the nursing home data were randomly sorted into two relatively 
equal subsamples. Subsample 1 contained 101 subjects, and 
Subsample 2, 99 subjects. Subsequently, within each subsample, 
z-scores and beta weights were obtained. The results for the two 
subsamples are presented in Tables 5 and 6. 



Insert Tables 5 and 6 about here 



From these values, nursing home satisfaction (NH_SAT) can be
predicted for each of the subsamples:

Let NH_SAT11 = predicted NH_SAT for Subsample 1
    NH_SAT22 = predicted NH_SAT for Subsample 2

therefore:

NH_SAT11 = (.80567 x Z1_MEAN) + (-.09939 x Z1_GOAL)
         + (-.01853 x Z1_SOCL) + (.10754 x Z1_YEAR)
NH_SAT22 = (.74817 x Z2_MEAN) + (-.03272 x Z2_GOAL)
         + (-.13812 x Z2_SOCL) + (.04420 x Z2_YEAR)

The cross-validation YHATs were computed as:

Let NH_SAT12 = invariance composite scores for Subsample 1
    NH_SAT21 = invariance composite scores for Subsample 2

therefore:

NH_SAT12 = (.74817 x Z1_MEAN) + (-.03272 x Z1_GOAL)
         + (-.13812 x Z1_SOCL) + (.04420 x Z1_YEAR)
NH_SAT21 = (.80567 x Z2_MEAN) + (-.09939 x Z2_GOAL)
         + (-.01853 x Z2_SOCL) + (.10754 x Z2_YEAR)

As a result of these computations, each subsample has two sets of
predicted NH_SAT scores, namely NH_SAT11 and NH_SAT12 for
subsample 1 and NH_SAT22 and NH_SAT21 for subsample 2.

The invariance coefficients for this analysis are presented 
as a part of Table 7. The correlation of NH_SAT11 with NH_SAT12 
yielded an invariance estimate of .98, indicating that the 
weights from the two subsamples were very stable. Similarly, the 
correlation of NH_SAT22 with NH_SAT21 yielded an invariance 
estimate of .98, thus indicating that weights in the second 
cross-validation performed very well also. Such a high degree of 
stability yields a large degree of confidence that the original 
regression equation for the full sample is an accurate predictor 
of nursing home satisfaction in this sample and that the equation 
is fairly stable across samples. 

Insert Table 7 about here 

And it is always the equation based on the full sample that
is ultimately the basis for interpretation. The subsample
analyses are conducted to get a feel for the stability of the
full sample results, and not to provide a basis for direct
interpretation.

Conclusion 

Double cross-validation is a method by which investigators
using multiple regression analyses can simultaneously obtain two
estimates of invariance either across two separate samples or two
subsamples drawn from one sufficiently large sample. The
advantage of using double cross-validation is that it provides a
second "replication" of the results that can be compared with
the first set of results.

In educational and psychological research, the importance of 
a study is typically determined by some test of statistical 
significance. Whereas these statistical significance tests are 
widely accepted as measures of importance, they are not very 
dependable indicators of result reproducibility. A much more 
accurate estimation of generalizability would be to empirically 
test the findings across samples and to determine the degree of 
stability across these samples rather than relying solely upon 
tests of significance to indicate reproducibility. 
Table 1

Hypothetical Data for Example 1 (n=20)

ID GROUP       Y      X1      X2      X3      X4      X5 YHAT_11 YHAT_12 YHAT_21 YHAT_22

 1   1    1.4494  2.6535  1.6129  1.4791  0.8249  1.3615  1.6706  2.8344     .       .
 2   1    0.1235  0.9332  1.0165 -0.4782  0.3678  0.0549  0.0358 -0.0791     .       .
 3   1   -0.1411  0.5796 -0.9072  0.3909  0.9591 -0.5203  0.2184  0.2745     .       .
 4   1   -0.6855 -0.7859  0.4845 -0.9226 -1.7435 -0.7351 -0.5731 -2.0287     .       .
 5   1    1.1238  1.2541  0.7067  0.4088  0.4282  0.6427  0.6294  1.0881     .       .
 6   1    0.3943 -0.5134 -0.4392  0.7841 -1.1065  0.6111  0.2953  0.6620     .       .
 7   1   -2.2715 -1.7856 -1.5240 -0.3698 -0.6803 -0.6170 -0.9373 -1.1105     .       .
 8   1   -1.7230 -1.0883 -0.6883 -1.5188 -1.1091  0.7939 -0.8318 -1.5652     .       .
 9   1   -1.5258 -0.8672 -1.2947 -0.1193  0.6647 -1.9800 -0.8131 -1.1258     .       .
10   1    1.0084  0.9381 -1.1084  0.5831  1.7002  0.7223  0.4423  1.5776     .       .
11   1   -0.5916 -0.4931 -1.3715 -0.3252 -0.4884  0.4942 -0.1362 -0.5275     .       .
12   2   -1.8914 -0.5908 -0.0792 -0.7090 -1.6626  0.2773     .       .   -0.2533 -1.1022
13   2   -0.5829  0.2041  0.1175  0.0811  0.8644 -0.6684     .       .    0.1969  0.0285
14   2    1.3889  0.3041 -0.2253 -0.4971  0.3586  2.0554     .       .    0.6474  0.5248
15   2   -1.4594 -1.3000  0.0148 -1.1782 -0.4230 -0.5962     .       .   -1.7938 -0.4300
16   2    2.8002  1.3113  2.3249  1.5848  0.9300  0.0506     .       .    1.5619  1.6242
17   2   -0.6989 -0.5218 -0.4050  1.2660 -1.3866 -1.3967     .       .    0.0817 -0.9846
18   2    0.8433 -0.2492 -0.5100  0.9812  0.3150 -0.1616     .       .    0.0437  0.3601
19   2    0.5183  0.8607  0.1861 -0.0563  0.3345  1.1504     .       .    1.3755  0.1125
20   2   -0.0161 -0.7832  0.5174 -2.1399  1.2817 -1.0705     .       .   -1.8595 -0.1332

Note. Y is the dependent variable. X1 to X5 are the predictor
variables. GROUP is the hypothetical variable randomly created
to divide the sample into two subsamples. The YHAT values for
each case are also presented.
Table 2

SAS Results for Subsample 1 in Example 1

CORRELATION FOR GROUP 1

            Y       X1       X2       X3       X4       X5

Y     1.00000  0.89693  0.61959  0.75530  0.51278  0.61643
X1    0.89693  1.00000  0.70185  0.70028  0.66596  0.54554
X2    0.61959  0.70185  1.00000  0.28096  0.03819  0.41411
X3    0.75530  0.70028  0.28096  1.00000  0.59227  0.33971
X4    0.51278  0.66596  0.03819  0.59227  1.00000  0.07339
X5    0.61643  0.54554  0.41411  0.33971  0.07339  1.00000

REGRESSION OF GROUP 1
DEP VARIABLE: Y
ANALYSIS OF VARIANCE

                      SUM OF         MEAN
SOURCE     DF        SQUARES       SQUARE    F VALUE    PROB>F

MODEL       5    13.22347310   2.64469462      6.992    0.0262
ERROR       5     1.89135420   0.37827084
C TOTAL    10    15.11482730

ROOT MSE    0.6150373    R-SQUARE    0.8749
DEP MEAN      -0.2581    ADJ R-SQ    0.7497
C.V.         -238.294

PARAMETER ESTIMATES

                   PARAMETER   STANDARDIZED
VARIABLE   DF       ESTIMATE       ESTIMATE

INTERCEP    1    -0.40105606     0
X1          1     0.93953203     0.99311254
X2          1    -0.19418815    -0.17190653
X3          1     0.37886145     0.25975559
X4          1    -0.34675263    -0.30172221
X5          1     0.10282402     0.07973646
Table 3

SAS Results for Subsample 2 in Example 1

CORRELATION FOR GROUP 2

            Y       X1       X2       X3       X4       X5

Y     1.00000  0.78854  0.60199  0.47657  0.63885  0.40768
X1    0.78854  1.00000  0.58245  0.56626  0.44290  0.51467
X2    0.60199  0.58245  1.00000  0.22532  0.47329 -0.00927
X3    0.47657  0.56626  0.22532  1.00000 -0.11255 -0.02053
X4    0.63885  0.44290  0.47329 -0.11255  1.00000  0.10272
X5    0.40768  0.51467 -0.00927 -0.02053  0.10272  1.00000

REGRESSION OF GROUP 2
FOR CROSS-VALIDATION
DEP VARIABLE: Y
ANALYSIS OF VARIANCE

                      SUM OF         MEAN
SOURCE     DF        SQUARES       SQUARE    F VALUE    PROB>F

MODEL       5    15.48388211   3.09677642      5.428    0.0972
ERROR       3     1.71140563   0.57046854
C TOTAL     8    17.19528774

ROOT MSE    0.7552937    R-SQUARE    0.9005
DEP MEAN    0.1002222    ADJ R-SQ    0.7346
C.V.          753.619

PARAMETER ESTIMATES

                   PARAMETER   STANDARDIZED
VARIABLE   DF       ESTIMATE       ESTIMATE

INTERCEP    1    -0.08466587     0
X1          1    -0.84445003    -0.47831374
X2          1     0.67590502     0.39200759
X3          1     0.90571977     0.74884287
X4          1     0.98266984     0.68758602
X5          1     0.80860873     0.60223747
Table 4

Invariance Results for Example 1

                Y  YHAT_11  YHAT_12  YHAT_21  YHAT_22

Y         1.00000  0.93137  0.87360  0.63737  0.93282
YHAT_11   0.93137  1.00000  0.94721     .        .
YHAT_12   0.87360  0.94721  1.00000     .        .
YHAT_21   0.63737     .        .     1.00000  0.53884
YHAT_22   0.93282     .        .     0.53884  1.00000

Note. The r between Y and YHAT_11 is the R for subsample 1; the r
between Y and YHAT_22 is the R for subsample 2. The r's between
YHAT_11 and YHAT_12 and between YHAT_22 and YHAT_21 are the
invariance coefficients.
Table 5

Regression Results for Subsample 1 in Example 2

CORRELATION FOR GROUP 1

            NH_SAT  MEANING    GOALS   SOCIAL    YEARS

NH_SAT     1.00000  0.75018  0.34277  0.25527  0.17272
MEANING    0.75018  1.00000  0.56846  0.34327  0.06857
GOALS      0.34277  0.56846  1.00000  0.15091 -0.12116
SOCIAL     0.25527  0.34327  0.15091  1.00000  0.11374
YEARS      0.17272  0.06857 -0.12116  0.11374  1.00000

REGRESSION FOR GROUP 1
BETAS USED IN ESTIMATING NH_SAT
FOR CROSS-VALIDATION

DEP VARIABLE: NH_SAT
ANALYSIS OF VARIANCE

                      SUM OF          MEAN
SOURCE     DF        SQUARES        SQUARE    F VALUE    PROB>F

MODEL       4      150.39325   37.59831128     33.717    0.0001
ERROR      96      107.05230    1.11512812
C TOTAL   100      257.44554

ROOT MSE    1.055996    R-SQUARE    0.5842
DEP MEAN    5.366337    ADJ R-SQ    0.5668
C.V.        19.67816

PARAMETER ESTIMATES

                    PARAMETER   STANDARDIZED
VARIABLE   DF        ESTIMATE       ESTIMATE

INTERCEP    1     -0.97756288     0
MEANING     1      0.41552367     0.80566633
GOALS       1     -0.06019197    -0.0993907
SOCIAL      1    -0.009329782    -0.01852470
YEARS       1      0.03907011     0.10754426
Table 6

Regression Results for Subsample 2 in Example 2

CORRELATION FOR GROUP 2

            NH_SAT  MEANING    GOALS   SOCIAL    YEARS

NH_SAT     1.00000  0.68295  0.26892  0.13953  0.10225
MEANING    0.68295  1.00000  0.46981  0.38664  0.08048
GOALS      0.26892  0.46981  1.00000  0.36867  0.02391
SOCIAL     0.13953  0.38664  0.36867  1.00000  0.00997
YEARS      0.10225  0.08048  0.02391  0.00997  1.00000

REGRESSION FOR GROUP 2
BETAS USED IN ESTIMATING NH_SAT
FOR CROSS-VALIDATION

DEP VARIABLE: NH_SAT
ANALYSIS OF VARIANCE

                      SUM OF          MEAN
SOURCE     DF        SQUARES        SQUARE    F VALUE    PROB>F

MODEL       4      109.26900   27.31724916     22.346    0.0001
ERROR      94      114.91282    1.22247682
C TOTAL    98      224.18182

ROOT MSE    1.105657    R-SQUARE    0.4874
DEP MEAN    5.242424    ADJ R-SQ    0.4656
C.V.        21.09056

PARAMETER ESTIMATES

                    PARAMETER   STANDARDIZED
VARIABLE   DF        ESTIMATE       ESTIMATE

INTERCEP    1      1.28855291     0
MEANING     1      0.30692720     0.74817030
GOALS       1     -0.02182330    -0.03272233
SOCIAL      1     -0.05702813    -0.13811796
YEARS       1      0.01538654     0.04419563

Table 7

Invariance Results for Example 2

            NH_SAT  NH_SAT11  NH_SAT12  NH_SAT21  NH_SAT22

NH_SAT     1.00000   0.76367   0.74721   0.68422   0.69785
NH_SAT11   0.76367   1.00000   0.97975      .         .
NH_SAT12   0.74721   0.97975   1.00000      .         .
NH_SAT21   0.68422      .         .      1.00000   0.98171
NH_SAT22   0.69785      .         .      0.98171   1.00000
Appendix A:
SAS Program to Analyze Table 1 Data

DATA STATS;
   INFILE ABC;
   INPUT ID GROUP Y X1 X2 X3 X4 X5;
PROC REG;
   MODEL Y = X1 X2 X3 X4 X5 / STB;
   TITLE 'REGRESSION FOR ALL DATA';
PROC CORR;
   VAR Y X1 X2 X3 X4 X5;
   TITLE 'CORRELATION FOR ALL DATA';

/* Subsample 1: z-scores from the group 1 means and SDs */
DATA GROUP1;
   SET STATS;
   IF GROUP=1;
   Z1_X1 = (X1 - 0.0750) / 1.6888;
   Z1_X2 = (X2 + 0.3193) / 1.1845;
   Z1_X3 = (X3 + 0.0080) / 0.7105;
   Z1_X4 = (X4 + 0.0166) / 1.1444;
   Z1_X5 = (X5 - 0.0753) / 0.9089;
   /* Own weights (group 1 betas), then crossed weights (group 2 betas) */
   YHAT_11 = (.99311*Z1_X1) - (.17191*Z1_X2) + (.25976*Z1_X3)
           - (.30172*Z1_X4) + (.07974*Z1_X5);
   YHAT_12 = (-.47831*Z1_X1) + (.39201*Z1_X2) + (.74884*Z1_X3)
           + (.68759*Z1_X4) + (.60224*Z1_X5);
PROC UNIVARIATE;
   VAR Y X1 X2 X3 X4 X5;
   TITLE 'UNIVARIATE STATISTICS FOR GROUP 1';
PROC REG;
   MODEL Y = X1 X2 X3 X4 X5 / STB;
   TITLE1 'REGRESSION OF GROUP 1';
   TITLE2 'FOR CROSS-VALIDATION';
PROC CORR;
   VAR Y X1 X2 X3 X4 X5;
   TITLE 'CORRELATION FOR GROUP 1';

/* Subsample 2: z-scores from the group 2 means and SDs */
DATA GROUP2;
   SET STATS;
   IF GROUP=2;
   Z2_X1 = (X1 + 0.0850) / 0.6895;
   Z2_X2 = (X2 - 0.2157) / 0.7230;
   Z2_X3 = (X3 + 0.0742) / 1.4693;
   Z2_X4 = (X4 - 0.0680) / 1.0524;
   Z2_X5 = (X5 + 0.0400) / 1.1923;
   /* Crossed weights (group 1 betas), then own weights (group 2 betas) */
   YHAT_21 = (.99311*Z2_X1) - (.17191*Z2_X2) + (.25976*Z2_X3)
           - (.30172*Z2_X4) + (.07974*Z2_X5);
   /* .39021 is the weight as printed here; Table 3 reports 0.39201 for X2 */
   YHAT_22 = (-.47831*Z2_X1) + (.39021*Z2_X2) + (.74884*Z2_X3)
           + (.68759*Z2_X4) + (.60224*Z2_X5);
PROC UNIVARIATE;
   VAR Y X1 X2 X3 X4 X5;
   TITLE 'UNIVARIATE STATISTICS FOR GROUP 2';
PROC REG;
   MODEL Y = X1 X2 X3 X4 X5 / STB;
   TITLE1 'REGRESSION OF GROUP 2';
   TITLE2 'FOR CROSS-VALIDATION';
PROC CORR;
   VAR Y X1 X2 X3 X4 X5;
   TITLE 'CORRELATION FOR GROUP 2';

/* Stack both subsamples and correlate Y with the four YHAT composites */
DATA REG_ALL;
   SET GROUP1 GROUP2;
PROC CORR;
   VAR Y YHAT_11 YHAT_12 YHAT_21 YHAT_22;
   TITLE 'INVARIANCE RESULTS';
PROC PRINT;
   VAR ID GROUP Y X1 X2 X3 X4 X5 YHAT_11 YHAT_12 YHAT_21 YHAT_22;